Learning Manifold Dimensions with Conditional Variational Autoencoders

Neural Information Processing Systems

For example, while prior work has suggested that the globally optimal VAE solution can learn the correct manifold dimension, a necessary (but not sufficient) condition for producing samples from the true data distribution, this has never been rigorously proven. Moreover, it remains unclear how such considerations would change when various types of conditioning variables are introduced, or when the data support is extended to a union of manifolds (e.g., as is likely the case for MNIST digits and related datasets). In this work, we address these points by first proving that VAE global minima are indeed capable of recovering the correct manifold dimension.
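The claim above concerns how many latent dimensions a trained VAE actually uses. A common diagnostic for this (our own illustration, not the paper's procedure; the name `active_dimensions` and the threshold are assumptions) counts a dimension as active when its posterior mean varies meaningfully across the dataset, since collapsed dimensions produce nearly constant means:

```python
import numpy as np

def active_dimensions(mu, threshold=1e-2):
    """Count 'active' latent dimensions of a VAE encoder.

    A dimension j is considered active when the variance of its
    posterior mean mu[:, j] across the dataset exceeds `threshold`
    (collapsed dimensions produce nearly constant means).

    mu : (n_samples, latent_dim) array of encoder means.
    """
    return np.where(mu.var(axis=0) > threshold)[0]

# Toy check: two informative dimensions, two collapsed ones.
rng = np.random.default_rng(0)
n = 500
mu = np.zeros((n, 4))
mu[:, 0] = rng.normal(size=n)        # active
mu[:, 2] = 2.0 * rng.normal(size=n)  # active
# dims 1 and 3 stay constant -> collapsed
print(active_dimensions(mu))         # -> [0 2]
```

A VAE that has recovered a d-dimensional manifold would, under this diagnostic, show exactly d active dimensions.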


BONSAI: Bayesian Optimization with Natural Simplicity and Interpretability

Daulton, Samuel, Eriksson, David, Balandat, Maximilian, Bakshy, Eytan

arXiv.org Machine Learning

Bayesian optimization (BO) is a popular technique for sample-efficient optimization of black-box functions. In many applications, the parameters being tuned come with a carefully engineered default configuration, and practitioners only want to deviate from this default when necessary. Standard BO, however, does not aim to minimize deviation from the default and, in practice, often pushes weakly relevant parameters to the boundary of the search space. This makes it difficult to distinguish between important and spurious changes and increases the burden of vetting recommendations when the optimization objective omits relevant operational considerations. We introduce BONSAI, a default-aware BO policy that prunes low-impact deviations from a default configuration while explicitly controlling the loss in acquisition value. BONSAI is compatible with a variety of acquisition functions, including expected improvement and upper confidence bound (GP-UCB). We theoretically bound the regret incurred by BONSAI, showing that, under certain conditions, it enjoys the same no-regret property as vanilla GP-UCB. Across many real-world applications, we empirically find that BONSAI substantially reduces the number of non-default parameters in recommended configurations while maintaining competitive optimization performance, with little effect on wall time.
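A minimal sketch of the default-aware idea (our own illustration with hypothetical names such as `prune_to_default` and a toy quadratic acquisition; not the authors' actual algorithm): greedily revert coordinates of a candidate back to their default values for as long as the acquisition value drops by no more than a fixed tolerance:

```python
import numpy as np

def prune_to_default(x, default, acq, max_loss=1e-3):
    """Greedy sketch of default-aware pruning: revert one coordinate
    of `x` at a time to its default value, as long as the acquisition
    value stays within `max_loss` of the original candidate's value.
    """
    x = x.copy()
    floor = acq(x) - max_loss          # lowest acceptable acquisition value
    improved = True
    while improved:
        improved = False
        best_j, best_val = None, -np.inf
        for j in np.where(x != default)[0]:
            trial = x.copy()
            trial[j] = default[j]      # try reverting coordinate j
            v = acq(trial)
            if v >= floor and v > best_val:
                best_j, best_val = j, v
        if best_j is not None:
            x[best_j] = default[best_j]
            improved = True
    return x

# Toy acquisition: only the first coordinate really matters.
acq = lambda x: -(x[0] - 1.0) ** 2 - 1e-4 * np.sum(x[1:] ** 2)
default = np.zeros(4)
cand = np.array([1.0, 0.3, -0.2, 0.7])
print(prune_to_default(cand, default, acq))  # spurious coords revert to 0
```

On this toy problem the three weakly relevant coordinates are reverted to the default while the important first coordinate is kept, which is the qualitative behavior the abstract describes.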



Anant-Net: Breaking the Curse of Dimensionality with Scalable and Interpretable Neural Surrogate for High-Dimensional PDEs

Menon, Sidharth S., Jagtap, Ameya D.

arXiv.org Artificial Intelligence

Physics-informed deep learning (PIDL) represents a rapidly advancing framework that integrates known governing physical laws, typically formulated as partial differential equations (PDEs), into the training process of deep neural networks. In contrast to conventional data-driven models that rely solely on observational data, PIDL incorporates physical constraints to guide learning, thereby enhancing generalization, reducing data dependence, and improving interpretability. This synthesis of physics and deep learning has demonstrated broad applicability in solving forward and inverse problems across scientific and engineering domains, particularly in scenarios involving limited, noisy, or deceptive data. Key methodologies under the PIDL umbrella include physics-informed neural networks (PINNs) [1, 2, 3, 4], which embed PDE constraints via automatic differentiation; sparse identification of nonlinear dynamics (SINDy) [5, 6], which infers governing equations by promoting sparsity in learned representations; and physics-informed neural operators [7, 8, 9, 10, 11], which approximate solution operators across function spaces to model families of PDEs. These approaches are particularly well-suited for high-dimensional problems, where traditional numerical solvers suffer from the curse of dimensionality. High-dimensional PDEs are integral to various scientific and engineering domains, including quantum mechanics, financial mathematics, and optimal control. Their solutions provide crucial insights into complex, multi-scale phenomena that cannot be accurately captured using lower-dimensional approximations. However, solving these equations efficiently remains a significant challenge due to the curse of dimensionality: the exponential growth in computational complexity and data requirements as the number of dimensions increases.
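As a concrete, dependency-free illustration of the PINN loss structure, the sketch below assembles the composite objective for a 1-D Poisson problem u''(x) = f(x). Real PINNs obtain u'' by automatic differentiation through the network; this toy version uses a central finite difference instead, and all names (`pinn_loss`, the weight `w_bc`) are our own assumptions:

```python
import numpy as np

def pinn_loss(u, x_col, f, x_bc, u_bc, h=1e-4, w_bc=10.0):
    """Physics-informed loss for the 1-D Poisson problem u''(x) = f(x).

    `u` is any surrogate callable; real PINNs compute u'' by automatic
    differentiation, while this sketch uses a central finite difference
    to stay self-contained.
    """
    u_xx = (u(x_col + h) - 2.0 * u(x_col) + u(x_col - h)) / h**2
    residual = np.mean((u_xx - f(x_col)) ** 2)  # PDE residual at collocation points
    boundary = np.mean((u(x_bc) - u_bc) ** 2)   # boundary-condition mismatch
    return residual + w_bc * boundary

# Sanity check: the exact solution of u'' = -sin(x) with u(0) = u(pi) = 0
# is u(x) = sin(x), so its loss should be numerically zero.
x_col = np.linspace(0.1, np.pi - 0.1, 50)
loss = pinn_loss(np.sin, x_col, lambda x: -np.sin(x),
                 np.array([0.0, np.pi]), np.array([0.0, 0.0]))
print(loss)  # ~ 0, up to finite-difference error
```

Training a PINN amounts to minimizing exactly this kind of composite loss over network parameters, with additional terms for any observational data.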




Sparse Autoencoders, Again?

Lu, Yin, Zhu, Xuening, He, Tong, Wipf, David

arXiv.org Artificial Intelligence

Is there really much more to say about sparse autoencoders (SAEs)? Autoencoders in general, and SAEs in particular, represent deep architectures that are capable of modeling low-dimensional latent structure in data. Such structure could reflect, among other things, correlation patterns in large language model activations, or complex natural image manifolds. And yet despite the wide-ranging applicability, there have been relatively few changes to SAEs beyond the original recipe from decades ago, namely, standard deep encoder/decoder layers trained with a classical/deterministic sparse regularizer applied within the latent space. One possible exception is the variational autoencoder (VAE), which adopts a stochastic encoder module capable of producing sparse representations when applied to manifold data. In this work we formalize underappreciated weaknesses with both canonical SAEs, as well as analogous VAEs applied to similar tasks, and propose a hybrid alternative model that circumvents these prior limitations. In terms of theoretical support, we prove that global minima of our proposed model recover certain forms of structured data spread across a union of manifolds. Meanwhile, empirical evaluations on synthetic and real-world datasets substantiate the efficacy of our approach in accurately estimating underlying manifold dimensions and producing sparser latent representations without compromising reconstruction error. In general, we are able to exceed the performance of equivalent-capacity SAEs and VAEs, as well as recent diffusion models where applicable, within domains such as images and language model activation patterns.
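The "original recipe" mentioned above can be written down in a few lines. The sketch below (our illustration; the name `sae_objective` and the weight `lam` are hypothetical) computes the canonical SAE objective: reconstruction error plus an L1 penalty on a ReLU latent code:

```python
import numpy as np

def sae_objective(x, W_enc, b_enc, W_dec, lam=0.1):
    """Objective of a canonical sparse autoencoder: reconstruction
    error plus an L1 penalty on the ReLU latent code."""
    z = np.maximum(x @ W_enc + b_enc, 0.0)       # sparse latent code
    x_hat = z @ W_dec                            # linear decoder
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(z), axis=1))
    return recon + lam * sparsity, z

# Toy check: a +/- "mirror" dictionary reconstructs exactly, so the
# objective reduces to lam times the mean L1 norm of the data.
x = np.array([[1.0, -2.0], [0.5, 3.0]])
W_enc = np.array([[1.0, 0.0, -1.0, 0.0],
                  [0.0, 1.0, 0.0, -1.0]])
W_dec = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [-1.0, 0.0],
                  [0.0, -1.0]])
obj, z = sae_objective(x, W_enc, np.zeros(4), W_dec, lam=0.1)
# Each sample activates only 2 of the 4 latent units.
```

The deterministic L1 regularizer here is precisely the component the abstract contrasts with the VAE's stochastic encoder, and the hybrid model proposed in the paper replaces it.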


Enhancing Interpretability of Sparse Latent Representations with Class Information

Abiz, Farshad Sangari, Hosseini, Reshad, Araabi, Babak N.

arXiv.org Artificial Intelligence

Variational Autoencoders (VAEs) are powerful generative models for learning latent representations. Standard VAEs generate dispersed and unstructured latent spaces by utilizing all dimensions, which limits their interpretability, especially in high-dimensional spaces. To address this challenge, Variational Sparse Coding (VSC) introduces a spike-and-slab prior distribution, resulting in sparse latent representations for each input. These sparse representations, characterized by a limited number of active dimensions, are inherently more interpretable. Despite this advantage, VSC falls short in providing structured interpretations across samples within the same class. Intuitively, samples from the same class are expected to share similar attributes while allowing for variations in those attributes. This expectation should manifest as consistent patterns of active dimensions in their latent representations, but VSC does not enforce such consistency. In this paper, we propose a novel approach to enhance the latent space interpretability by ensuring that the active dimensions in the latent space are consistent across samples within the same class. To achieve this, we introduce a new loss function that encourages samples from the same class to share similar active dimensions. This alignment creates a more structured and interpretable latent space, where each shared dimension corresponds to a high-level concept, or "factor." Unlike existing disentanglement-based methods that primarily focus on global factors shared across all classes, our method captures both global and class-specific factors, thereby enhancing the utility and interpretability of latent representations.
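One simple surrogate for the consistency idea described above (our own illustration, not the paper's actual loss function; `class_consistency_loss` is a hypothetical name): treat each latent dimension's activation probability within a class as a Bernoulli rate and penalize rates that are neither 0 nor 1, so the loss vanishes exactly when every sample in a class uses the same active dimensions:

```python
import numpy as np

def class_consistency_loss(gates, labels):
    """Illustrative surrogate loss: for each class, penalize dimensions
    whose activation is neither consistently on nor consistently off
    across that class's samples.

    gates : (n_samples, latent_dim) activation probabilities in [0, 1].
    """
    classes = np.unique(labels)
    loss = 0.0
    for c in classes:
        p = gates[labels == c].mean(axis=0)  # per-dim on-rate within class c
        loss += np.sum(p * (1.0 - p))        # 0 iff every dim agrees
    return loss / len(classes)

# Perfectly consistent classes -> zero loss.
gates = np.array([[1, 0, 1], [1, 0, 1],                # class 0 uses dims {0, 2}
                  [0, 1, 0], [0, 1, 0]], dtype=float)  # class 1 uses dim {1}
labels = np.array([0, 0, 1, 1])
print(class_consistency_loss(gates, labels))  # -> 0.0
```

Minimizing such a term alongside the usual VSC objective would push samples of the same class toward a shared pattern of active dimensions, which is the structure the paper aims for.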


Leveraging Axis-Aligned Subspaces for High-Dimensional Bayesian Optimization with Group Testing

Hellsten, Erik, Hvarfner, Carl, Papenmeier, Leonard, Nardi, Luigi

arXiv.org Machine Learning

Bayesian optimization (BO) is an effective method for optimizing expensive-to-evaluate black-box functions. High-dimensional problems can be particularly challenging, due to the multitude of parameter choices and the potentially high number of data points required to fit the model, but this limitation can be addressed if the problem satisfies simplifying assumptions. The axis-aligned subspace assumption, under which only a few dimensions significantly impact the objective, has motivated several algorithms for high-dimensional BO. However, the validity of this assumption is rarely verified, and it is rarely exploited to its full extent. We propose a group testing (GT) approach to identify active variables and thereby facilitate efficient optimization in these domains. The proposed algorithm, Group Testing Bayesian Optimization (GTBO), first runs a testing phase in which groups of variables are systematically selected and tested for whether they influence the objective, terminating once the active dimensions are identified. To that end, we extend the well-established GT theory to functions over continuous domains. In the second phase, GTBO guides optimization by placing more importance on the active dimensions. By leveraging the axis-aligned subspace assumption, GTBO outperforms state-of-the-art methods on benchmarks satisfying that assumption, while offering improved interpretability.
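The two-phase procedure can be caricatured in a few lines. The sketch below (a drastic simplification with hypothetical names; the actual method builds on GT theory and noise-aware statistical tests) perturbs whole groups of coordinates at once, discards groups that never change the objective, and bisects surviving groups down to single active coordinates:

```python
import numpy as np

def group_test_active_dims(f, dim, group_size=2, n_probes=4, tol=1e-6, seed=0):
    """Toy group-testing phase: a group whose perturbation never changes
    f is declared inactive; surviving groups are split recursively until
    individual active coordinates are isolated."""
    rng = np.random.default_rng(seed)
    base = np.zeros(dim)
    f0 = f(base)

    def group_is_active(idx):
        for _ in range(n_probes):
            x = base.copy()
            x[idx] = rng.normal(size=len(idx))  # perturb the whole group
            if abs(f(x) - f0) > tol:
                return True
        return False

    active = []
    stack = [list(g) for g in np.array_split(np.arange(dim), dim // group_size)]
    while stack:
        g = stack.pop()
        if not group_is_active(g):
            continue                  # whole group is inactive: discard
        if len(g) == 1:
            active.append(int(g[0]))  # isolated an active coordinate
        else:
            mid = len(g) // 2
            stack += [g[:mid], g[mid:]]
    return sorted(active)

# Only dimensions 0 and 5 influence this 8-D objective.
f = lambda x: (x[0] - 1.0) ** 2 + np.sin(x[5])
print(group_test_active_dims(f, dim=8))  # -> [0, 5]
```

Once the active set is known, the second phase would restrict (or strongly bias) the BO model toward those coordinates, which is where the sample-efficiency gains come from.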